Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datasource for VMware #953

Merged

Conversation

akutz
Copy link
Contributor

@akutz akutz commented Jul 23, 2021

Proposed Commit Message

Datasource for VMware

This patch finally introduces the Cloud-Init Datasource for VMware
GuestInfo as a part of cloud-init proper. This datasource has existed
since 2018, and rapidly became the de facto datasource for developers
working with Packer, Terraform, for projects like kube-image-builder,
and the de jure datasource for Photon OS.

The major change to the datasource from its previous incarnation is
the name. Now named DatasourceVMware, this new version of the
datasource will allow multiple transport types in addition to
GuestInfo keys.

This datasource includes several unique features developed to address
real-world situations:

  * Support for reading any key (metadata, userdata, vendordata) both
    from the guestinfo table when running on a VM in vSphere as well as
    from an environment variable when running inside of a container,
    useful for rapid dev/test.

  * Allows booting with DHCP while still providing full participation
    in Cloud-Init instance data and Jinja queries. The netifaces library
    provides the ability to inspect the network after it is online,
    and the runtime network configuration is then merged into the
    existing metadata and persisted to disk.

  * Advertises the local_ipv4 and local_ipv6 addresses via guestinfo
    as well. This is useful as Guest Tools is not always able to
    identify what would be considered the local address.

The datasource currently lives in its own GitHub repository at
https://github.com/vmware/cloud-init-vmware-guestinfo. Once the datasource
is merged into Cloud-Init, the old repository will be deprecated.

Additional Context

  • The primary author and current steward of this datasource spoke at Cloud-Init Con 2020 where there was interest in contributing this datasource to the Cloud-Init codebase.
  • This datasource has an active community that provides immense support to the project. Cc @codenrhoden, @toabctl, @jaymzmac, @prologic, @MnrGreg, @janitha, @powersj

Test Steps

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

@akutz
Copy link
Contributor Author

akutz commented Jul 23, 2021

It says the CLA validate failed. I did sign it (this morning). I suppose it just is not approved yet?

@powersj
Copy link
Contributor

powersj commented Jul 23, 2021

Yeah the CLA check takes 1-2 days on our end, so probably early next week

@akutz
Copy link
Contributor Author

akutz commented Jul 23, 2021

Thanks @powersj ,

I will fix the lint issues. I also thought I removed the dependency on deepmerge. It would be ironic if that's a PR I did not yet...merge :)

@akutz
Copy link
Contributor Author

akutz commented Jul 26, 2021

Hi @powersj,

I was looking to see if it possible to have a datasource respond to two different names, and while that does not appear to be the case, I did find that this datasource no longer fully matches the DS docs. I am going to update this DS with an RST doc instead of the MD one I included, as well as determine which of the current methods the DS is not implementing. While I know the DS works, I'd rather it be merged at a baseline that is congruent with the current criteria. Thanks!

@TheRealFalcon
Copy link
Member

With regards to the CLA, I can see that you've signed the CLA, but you haven't yet added yourself to the CLA signers file. As part of this PR, please also add your name (alphabetically) to the CLA signers file. The full details are described in the last bullet point of the documentation.

Based on your last comment, it sounds like there's still more work you want to do as part of this PR. Are you looking for review feedback now, or would you like to make the updates first?

@akutz
Copy link
Contributor Author

akutz commented Jul 27, 2021

With regards to the CLA, I can see that you've signed the CLA, but you haven't yet added yourself to the CLA signers file. As part of this PR, please also add your name (alphabetically) to the CLA signers file. The full details are described in the last bullet point of the documentation.

Based on your last comment, it sounds like there's still more work you want to do as part of this PR. Are you looking for review feedback now, or would you like to make the updates first?

Thanks for the question! Well, you can provide feedback if you like :)

I think all I'm going to do for now is:

  • Rename this to "DataSourceVMware" to follow the existing pattern of other DS names
  • Change the MD to an RST per the docs

Anything else can come later, or as part of the review. I realize I'm missing some functions in the DS prototype per the current documentation. I'd rather y'all let me know what is missing. Some things are a result of maintaining Py2 compat, but that's no longer a concern once merged into CI proper.

@akutz akutz force-pushed the feature/datasource-vmware-guestinfo branch from 4c7cd9c to 1c9b6f5 Compare July 27, 2021 18:38
@akutz akutz changed the title Datasource for VMware GuestInfo Datasource for VMware Jul 27, 2021
@akutz
Copy link
Contributor Author

akutz commented Jul 27, 2021

Okay, I've updated the name and docs. Have at ye! :)

@akutz akutz force-pushed the feature/datasource-vmware-guestinfo branch 2 times, most recently from 5621f8a to ad42b84 Compare July 27, 2021 20:50
Copy link
Member

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall. I'm used to seeing cloud-init retrieving the network information from the datasource to apply to the machine and not the other way around, so this is a bit unique in that regard.

I left a number of inline comments, but most are pretty minor. The one major thing is the use of netifaces. We're a bit wary of adding new dependencies to cloud-init. We have some similar networking reporting tools in cloudinit/netinfo.py. Would you be able to use the functions there instead of pulling in netifaces?

Additionally, there are a few places where you reference support for older versions of cloud-init. Since this datasource will ship as part of whatever version of cloud-init we're shipping, compatibility with older versions won't be a concern anymore. I highlighted these inline as well.

You may want to add VMware to Supported Private Clouds section in README.md.

Each datasource has its own set of unit tests under tests/unittests/test_datasource. Would you be able to add your own?

doc/rtd/topics/datasources/vmware.rst Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Outdated Show resolved Hide resolved
cloudinit/sources/DataSourceVMware.py Show resolved Hide resolved
@akutz
Copy link
Contributor Author

akutz commented Jul 27, 2021

This looks good overall. I'm used to seeing cloud-init retrieving the network information from the datasource to apply to the machine and not the other way around, so this is a bit unique in that regard.

I left a number of inline comments, but most are pretty minor. The one major thing is the use of netifaces. We're a bit wary of adding new dependencies to cloud-init. We have some similar networking reporting tools in cloudinit/netinfo.py. Would you be able to use the functions there instead of pulling in netifaces?

Additionally, there are a few places where you reference support for older versions of cloud-init. Since this datasource will ship as part of whatever version of cloud-init we're shipping, compatibility with older versions won't be a concern anymore. I highlighted these inline as well.

You may want to add VMware to Supported Private Clouds section in README.md.

Each datasource has its own set of unit tests under tests/unittests/test_datasource. Would you be able to add your own?

Hi @TheRealFalcon,

Thank you for the review! Before I respond/address/fix anything else, I wanted to comment on the netifaces remark. I raised this in cloud-init con 2020, and the leadership there did not think it would be an issue.

Looking at cloudinit/netinfo.py, I noticed it shell execs out to commands like ip, ifconfig, and netstat to gather most of its information. However, this is not as portable as C-level interop implemented by netifaces. Additionally, it's unclear to me if netinfo.py supports determining the default gateway for a particular device or address family the way netifaces does and on which this datasource relies.

Netifaces supports Windows as well. While that's not relevant here per se, Cluster API (CAPI), the Kubernetes project for bootstrapping Kube clusters with Kubernetes, supports Linux and Windows nodes. The bootstrap controller produces cloud-init payloads for Linux guests, and in order to support Windows, the CAPI team worked with Microsoft to port this datasource to cloudbase-init in order to make it as seamless as possible for CAPI and its infrastructure providers (Azure, AWS, VMware, etc.) to support bootstrapping Windows nodes.

That's a long way to say, can I pretty please continue to use netifaces, if only for the initial merge and then perhaps figure out a future plan for removing the dependency? It's just a lot of up front work to change now. I'm not opposed to it, but as you are aware, any changes, no matter how small, require revalidation. I know what we have works, and while new unit tests might show some other method works, nothing is as definitive as two years of production workloads.

@TheRealFalcon
Copy link
Member

That's a long way to say, can I pretty please continue to use netifaces, if only for the initial merge and then perhaps figure out a future plan for removing the dependency?

Only because you said pretty please 😉 .

I remember the conversations about netifaces and I also remember concluding that it shouldn't be too hard to replace it, so that's why I suggested it here. Your reasoning makes sense though, and I think it's ok to keep it currently, but I would also like to get a +1 from @blackboxsw on this.

@akutz akutz force-pushed the feature/datasource-vmware-guestinfo branch from ad42b84 to 6e50d86 Compare July 28, 2021 17:30
@akutz
Copy link
Contributor Author

akutz commented Jul 28, 2021

Hi @TheRealFalcon,

Each datasource has its own set of unit tests under tests/unittests/test_datasource. Would you be able to add your own?

Absolutely, of course. The only reason I did not was because I'm not sure how to mock the RPC calls. I can create tests for everything post the actual retrieval of the data. The other challenge is the fact the datasource relies on inspecting the host's network configuration. I am not sure how I should mock that so the answer is deterministic enough to be testable.

Do you have suggestions on how to do either? FWIW, this is why testing for me has always been firing up a VM on my desktop in Fusion and verifying the outcome. The datasource's RPC calls via vmware-rpctool will read data placed in the VM's vmx config file on disk. Here's an example of how I tested it in kube-image-builder.

I'm more than happy to add unit tests if we can figure out how to address the two items above. Heck, I'm happy to add some even without addressing those items, but given those items should be tested, I would sure appreciate helping figuring out how to do so.

@akutz akutz force-pushed the feature/datasource-vmware-guestinfo branch from 6e50d86 to 31cf845 Compare July 28, 2021 18:52
@TheRealFalcon
Copy link
Member

Absolutely, of course. The only reason I did not was because I'm not sure how to mock the RPC calls. I can create tests for everything post the actual retrieval of the data. The other challenge is the fact the datasource relies on inspecting the host's network configuration. I am not sure how I should mock that so the answer is deterministic enough to be testable.

In general there's some helper base classes that do a lot of the mocking for you. Just as an example, if you take a look at test_digitalocean.py, you can see it inherits from CiTestCase and with very little additional setup you should be able to instantiate a datasource and start interacting with it. Beyond that, I think it's just mocking the individual parts of the datasource that make external calls or rely on specific instance state. In your case, you could probably mock get_guestinfo_value, or perhaps the subp/env vars within it. You'd also need to mock some of the netifaces/ip_address calls.

Does that help at all?

FWIW, this is why testing for me has always been firing up a VM on my desktop in Fusion and verifying the outcome.

Yeah, unit testing is frustrating when we have to mock so much. It definitely helps in regression testing though. We do also have an integration testing framework that lets us do some more real-world test cases. These tests rely heavily on another library called pycloudlib. The scope would be beyond this PR, but if we could make VMware a supported "cloud" in pycloudlib, it would it very simple to write an integration test for this datasource that is end-to-end and doesn't require any mocking. I'll admit that I'm fairly oblivious to the VMware ecosystem, so I would need to come up to speed on how the VMs are launched and managed, but if you're interested in more representative tests, this would be the way to go. It'd be a welcome addition into pycloudlib.

@akutz akutz force-pushed the feature/datasource-vmware-guestinfo branch from 31cf845 to bff7029 Compare July 29, 2021 21:51
@akutz
Copy link
Contributor Author

akutz commented Jul 29, 2021

HI @TheRealFalcon,

Holy. crap.

@mock.patch is magic. It's pure magic. Seriously. This is amazing.

@akutz akutz force-pushed the feature/datasource-vmware-guestinfo branch 5 times, most recently from 6b120ee to 3e994d3 Compare July 30, 2021 20:20
akutz added a commit to akutz/cloud-init-vmware-guestinfo that referenced this pull request Aug 11, 2021
This patch adds a note to the README that indicates the repo is
now deprecated since the datasource has been merged into cloud-
init with canonical/cloud-init#953.
blackboxsw pushed a commit that referenced this pull request Aug 12, 2021
VMware support was added under #953.
TheRealFalcon pushed a commit to TheRealFalcon/cloud-init that referenced this pull request Aug 12, 2021
VMware support was added under canonical#953.
Also includes the python3-netifaces dependency in d/control
TheRealFalcon pushed a commit to TheRealFalcon/cloud-init that referenced this pull request Aug 12, 2021
VMware support was added under canonical#953.
Also includes the python3-netifaces dependency in d/control
TheRealFalcon pushed a commit to TheRealFalcon/cloud-init that referenced this pull request Aug 12, 2021
VMware support was added under canonical#953.
Also includes the python3-netifaces dependency in d/control
TheRealFalcon pushed a commit to TheRealFalcon/cloud-init that referenced this pull request Aug 12, 2021
VMware support was added under canonical#953.
Also includes the python3-netifaces dependency in d/control
blackboxsw pushed a commit that referenced this pull request Aug 12, 2021
VMware support was added under #953.
Also includes the python3-netifaces dependency in d/control
blackboxsw pushed a commit that referenced this pull request Aug 12, 2021
VMware support was added under #953.
Also includes the python3-netifaces dependency in d/control
blackboxsw pushed a commit that referenced this pull request Aug 12, 2021
VMware support was added under #953.
Also includes the python3-netifaces dependency in d/control
holmanb pushed a commit to holmanb/cloud-init that referenced this pull request Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants